A Study of Various Existing Techniques to deal with Pipeline Hazards
Renuka Patel, Sanjay Kumar
Pt. Ravishankar Shukla University, Raipur (Chhattisgarh) 492010 India
*Corresponding Author E-mail: renu.access.me@gmail.com, sanraipur@rediffmail.com
ABSTRACT:
Nowadays improvement in processor’s performance is the outcome of the exploitation of some form of parallelism. The viability of multiprocessor systems depend heavily on the presence of parallelism in the application program. There are number of ways through which parallelism can be achieved (like through multitasking, through multiprogramming, through multithreading, through distributed computing, through time sharing, through space sharing etc.) Pipelining is the one of the way to achieve parallelism in easiest and most economical way. But, Hazard is the major problem in the pipelining. Therefore, various researches have proposed various hazard detection and prevention techniques. This paper presents concise overview of some of the existing hazard detection and prevention techniques.
KEYWORDS: Pipeline Hazards, Techniques.
1. INTRODUCTION:
Parallelism allows the hardware to accelerate execution speed of an application by executing multiple independent operations simultaneously. Executing two or more operations in same time is known as Parallelism. Parallelism is very important for multiprocessor system [1]. To achieve parallelism in easy manner pipelining is a widely used technique [2]. Pipelining is becoming very common in all types of computers [3]. An ideal pipeline gives the speedup equal to the number of pipeline stages. But, practically it is very difficult to achieve the speedup of ideal pipelining, because of Hazards [4]. Hazards are the problem in pipelining caused by inter dependency among the instructions executed in pipelined manner. If hazard is not handled carefully then it can produce unreliable results [5-6]. There are basically 3 types of hazards namely: Data dependent hazards, Structural hazards, and Control hazards/Branch hazards. Data hazards are caused by data dependency. This hazard may be again divided into 4 types: RAR (Read After Read), RAW (Read After Write), WAR (Write After Read), and WAW (Write After Write). Structural hazards are caused by resource conflicts. Control hazards/Brach hazards: Control hazards are caused by branches.
2. Hazard detection and prevention techniques:
One of the pronged problem in pipelining is the detection and prevention of pipeline hazards [7]. A large number of methods have been proposed to detect and avoid pipeline hazards. One of the easiest and simplest method to avoid all types of hazards is giving stalls [8]. Stalls are also known as pipeline bubbles, or delay, or idle cycles, or pipeline break[9-10]. Stall is the simplest method to prevent all types of hazards. Whenever hazard occurred a delay is inserted in the pipeline. Stalls do not need any extra hardware. During stalls NOP (No_OPeration) instructions are executed [11]. Stall acts like a blank instruction and tells the hardware to do not perform any task [12]. Stalls decrease throughput because it increases number of idle cycles or latency in pipelining [13].
Although, pipeline stall is the universal remedy for all types of hazards but, stalls degrades the performance [14-15]. In spite of stall, there are other methods also to handle hazards without affecting the performance. These methods can be mainly categorized into two parts: first category methods are based on softwares where compiler resolve the dependency at compile time and second category methods are based on hardwares where any special hardware resolve the hazards at run time [15].
2.1 Methods based on Softwares:
In software (compiler) based methods, a software i.e. compiler does all the tasks. This is also known as static scheduling. No additional hardware is required to implement these methods because the hazards are resolved by compiler by improvement made in the code itself. Software based methods also do not require extra power and can be performed on any hardware which are capable to implement the pipeline. In software based technique, compiler (also called postprocessor) rearranges the instructions such that two instructions which are dependent on each other should not to be too close. In other words compiler rearranges the instructions in such a way so that gap can increase between successive dependent instructions (interlocked instructions) with maintaining the semantic and dependence relationship of program. In software based methods, compiler is more powerful and it generates independent code for pipelined processor. Some form of static scheduling is used in the modern compiler [15]. Gibbons & Steven S. Muchnick have proposed an algorithm for reordering instructions at compile time which is significantly reduces the number of interlocks occurring when code is executed on any implementation of the subject architecture [16]. According to Smith one of the reason behind the performance losses in high-performance computer systems, is conditional branch instructions (control hazard). This performance loss can minimizes by the predicting of branch result in advance and fetching, decoding, and issuing subsequent instructions, prior the actual result is known. So, Smith have discussed firstly, several prior used static methods to predict branch and analyzed the methods by instruction trace data. Then, new methods are proposed which are based on instruction opcodes and enforced successfully these methods to 6 FORTRAN programs. His new methods provide greater accuracy and more flexibility at low cost [17]. Thomas and James have studied the behavior of branches and presented a simple set of program based heuristics for statically predicting branches and combined these into a single branch prediction heuristic which can performs well for a large and diverse set of programs. In addition to using natural loop analysis to predict branches which control the iteration of loops, they focus on heuristics for predicting non-loop branches, which dominate the dynamic branch count in many programs. These heuristics are local in nature, requiring little program analysis and are effective in terms of coverage and miss rate [18].
Although, it is not possible to detect and prevents all the hazards at compile time by compiler, that is why some researchers have proposed some methods which can detect and prevent hazards at run time. Those methods come under hardware based methods which are described in the following sections.
2.2 Methods based on Hardware:
In hardware based methods a hardware i.e. preprocessor is used to rearranges the instructions. This is comes under dynamic scheduling. Hardware based hazard detection and prevention techniques are based on some hardwares like buffers. These buffers are also known as waiting station or reservation station. In these stations dependent instructions can wait for its required operands and allow the next subsequent instruction to proceed [19]. In dynamic scheduling preprocessor (Hardware) is more powerful and it generates independent instructions for pipelined processor. Tomasulo’s register-tagging Algorithm [19], Scoreboarding [20] make use of dynamic scheduling. Register-tagging scheme is built in IBM 360/91 and scoreboarding is built in CDC 6600 processor [15]. WAR and WAW hazards can be avoided by Register renaming [21]. Davidson, Shar, et al. have used reservation table to detect structural hazards [22]. In the year of 1994, Proebsting and Fraser have proposed a method which can detect very quickly structural hazards. Their method accepts a compact specification of the pipeline and creates a finite state automaton which can detect structural hazard. They claim that their method to detect structural hazard is 5 to 80 times faster than its predecessor methods [23].
RAW hazard can be prevented by data forwarding technique which is given by IBM Engineers. In this technique short-circuiting approach is adopted in which copy of data is forwarded to the all waiting instructions directly [24]. Internal forwarding technique is used to avoid data hazard. Internal forwarding replaces memory transfer by register transfer. This technique replaces unnecessary memory access by register to register transfer because fetching data from register is faster and simpler than fetching data from memory [24]. Register tagging technique is also used to avoid data hazard. Register tagging refers to the use of tagged registers, buffers and reservation stations for exploiting simultaneous activities among multiple arithmetic units. This method use binary bits to show the status of any variable where 1 is used for busy and 0 is used for idle [24]. Emma and Davidson characterize the effects of branches and data dependencies on performance [25]. Tsu and Yale have proposed two-level dynamic branch prediction technique, which alters the branch prediction algorithms on the basis of information collected by run time. Their method to predict a branch is based on the examining the history of last n branches [26]. Butler and Patt have investigate the performance of various dynamic scheduling algorithms [27].
3. CONCLUSION:
This paper summarises some existing techniques to detect and prevent the hazards. One of the simplest technique to deal the hazard problem is giving stalls. But, stalls reduce the efficiency of processor. However, there are others methods also available to resolve hazards which help to retain efficiency. These methods can be broadly divided into two types software based methods and hardware based methods. Software based methods do not require extra hardware and power, but all the time, these methods are not able to detect and prevent all types of hazards efficiently, therefore hardware based methods are developed. But, still hazard can not be detected and prevented totally. Therefore, hazard detection and prevention are still ongoing problems.
4. REFERENCES:
1. Kumar S., “Mathematical Modelling and Simulation of a Buffered Fault Tolerance Double Tree Network”, 15th international conference on advanced computing and communication, IEEE, 2007, pp 422- 426.
2. Shar L.E., " Design and Scheduling of Statically Configured Pipelines", Technical report no 42, Stanford Univ Calif Stanford Electronics Lab, September 1972, pp 1-172.
3. Wastson W. J. "The TI ASC: a highly modular and flexible super computer architecture", Proceedings of the fall joint computer conference, December 5-7, 1972, pp 221-228
4. Forsell M.J.,“Minimal pipeline architecture - an alternative to superscalar architecture” Elsevier Microprocessors and Microsystems 20 (1996), 22 April 1996, pp 277-284.
5. Schneider J., Ferdinand Ch., “Pipeline Behavior Prediction for Superscalar Processors by Abstract Interpretation”, Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems, Vol. 34, Issue 7, July 1999, ACM New York, NY, USA, pp 35 – 44.
6. Pandey A.“ Study of data hazard and control hazard resolution techniques in a simulated five stage pipelined RISC processor”, IEEE Xplore, Inventive Computation Technologies (ICICT), International Conference on 26-27 Aug. 2016 at Coimbatore, India.
7. Huang I., Despain A.M. “Hardware/software resolution of pipeline hazards in pipeline synthesis of instruction set processors” Computer-Aided Design, 1993. ICCAD-93. Digest of Technical Papers., 1993 IEEE/ACM International Conference on 7-11 Nov 1993, at Santa Clara, CA, USA, IEEE Xplore, pp 594-599.
8. Carter Nicholas,” Computer Architecture” Schaum’s Outlines, Tata McGraw-Hill, 2002, pp 123.
9. https://en.wikipedia.org/wiki/Bubble_(computing) accessed on 2-02-18.
10. https://en.wikipedia.org/wiki/Hazard_(computer_architecture)#PIPELINE-FLUSH accessed on 2-02-18
11. Saravanan V., Kothari D. P. and Woungang I.,” An optimizing pipeline stall reduction algorithm for power and performance on multi-core CPUs” Human-centric Computing and Information Sciences , SpringerOpen Journal, 2015, pp 1-13.
12. Lutze S., "Pipelining: Hazards, Methods of Optimization, and a Potential Low-Power Alternative" Senior Thesis submitted to Haverford Computer Science Department Dave Wonnacott, May 4, 2011.
13. Casavant A.E. ,“Balancing structural hazards and hardware cost of pipelined processors”, IEEE Xplore ,Proceedings of European Design and Test Conference ED & TC, held on 6-9 Mar 1995 at Paris France, pp 1-6.
14. Stallings W.,”Computer Organization and Architecture ” Pearson Education 2010, ISBN 978-81-7758-993-1, pp 504-510.
15. Hwang K., “Advanced Computer Architecture” Tata Mc Graw Hill 2001, pp 312-314, 288, 288-290, 288.
16. Gibbons P. B. & Muchnick S. S.,” Efficient Instruction Scheduling for a Pipelined Architecture”, Proceedings of the 1986 SIGPLAN symposium on Compiler construction, ACM 0-89791-197-0/8.6OO-Ool.75, 1986, pp 11-16.
17. Smith J.E., “A Study of Branch Prediction Strategies,” Proceedings of the 4th Annual International Symposium on Computer Architecture (SIGARCH Newsletter) 9(3) ACM and IEEE Computer Society, (May 1981), pp 135-148.
18. Ball Th. and Larus J.R.,"Branch prediction for free", Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation pp 300-313.
19. Tomasulo R.M.“An Efficient Algorithm for Exploiting Multiple Arithmetic Units” IBM Journol of Research ond Development. Jan. 1967, pp 25-33.
20. Smith J.S.," Dynamic Instruction Scheduling and the Astronautics ZS-1 ", Computer , Vol. 22, Issue: 7, July 1989 EEE Computer Society Press Los Alamitos, CA, USA, pp 21-35.
21. Rajaraman V.,”Fundamentals of Computers”, PHI1996 , ISBN-8120310039, pp 210-217.
22. Davidson E.S., Shar L. E.,Thomas A. Th.,Patel J. H., “Effective control for pipelined computers ”.In Spring COMPCON75 Digest of Papers, IEEE Computer Society, Feb. 1975, pp 181–184.
23. Proebsting T, Fraser “Detecting Pipeline Structural Hazards Quickly”, Published in 94 Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages, Jan. 16 - 19, ACM, New York, USA ©1994, pp 280-286.
24. Briggs A. and Hwang K., “Computer Architecture and parallel Processing” Mc Graw Hill 2001, pp 203,196,200.
25. Emma P.G., Davidson E.S., “Characterization of branch and data dependencies in programs for evaluating pipeline performance,” IEEE Transactions on Computers, Vol. 36 Issue 7, July 1987, pp 859-875.
26. T. Yeh and Y. Patt, "Two - Level Adaptive Training Branch Prediction," Proceeding 24th Annual ACM/IEEE Symp. and Workshop on Microarchitecture, Nov 1991, PP 51 – 61.
27. Butler M., Patt Y., "An Investigation of the Performance of Various Dynamic Scheduling Techniques ", IEEE Microarchitecture, 1992. MICRO 25, Proceedings of the 25th Annual International Symposium on 1-4 Dec. 1992, at Portland, USA.
|
Received on 14.05.2018 Accepted on 20.07.2018 ©A&V Publications all right reserved Research J. Engineering and Tech. 2018;9(4): 304-306. DOI: 10.5958/2321-581X.2018.00041.7 |
|